Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 291835 |
| Missing cells | 44396 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 31.2 MiB |
| Average record size in memory | 112.0 B |
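The percentages and the average record size in the table above follow from simple ratios. A minimal sketch, assuming the missing-cell percentage is missing cells divided by rows × columns and that 1 MiB is 1,048,576 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics table above.
n_variables = 14
n_observations = 291_835
missing_cells = 44_396
total_size_mib = 31.2  # rounded value as reported

# Share of missing cells over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate bytes per record, derived from the rounded total size.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

All 44396 missing cells belong to GEMIDDELDE_VERKOOPPRIJS, which is why the same count amounts to 15.2% of that single column but only 1.1% of the whole table.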
Variable types
| Categorical | 4 |
|---|---|
| DateTime | 1 |
| Numeric | 9 |
VERSIE has constant value "1.0" | Constant |
DATUM_BESTAND has constant value "2022-04-12" | Constant |
PEILDATUM has constant value "2022-04-01" | Constant |
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1770 distinct values | High cardinality |
BEHANDELEND_SPECIALISME_CD is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High correlation |
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with BEHANDELEND_SPECIALISME_CD and 1 other fields | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
VERSIE is highly correlated with PEILDATUM and 1 other fields | High correlation |
PEILDATUM is highly correlated with VERSIE and 1 other fields | High correlation |
DATUM_BESTAND is highly correlated with VERSIE and 1 other fields | High correlation |
ZORGPRODUCT_CD is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with ZORGPRODUCT_CD and 1 other fields | High correlation |
GEMIDDELDE_VERKOOPPRIJS has 44396 (15.2%) missing values | Missing |
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.01716558) | Skewed |
Reproduction
| Analysis started | 2022-05-02 01:52:58.833041 |
|---|---|
| Analysis finished | 2022-05-02 01:53:22.123432 |
| Duration | 23.29 seconds |
| Software version | pandas-profiling v3.1.1 |
| Download configuration | config.json |
VERSIE
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
| 1.0 |
|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 875505 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 291835 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 291835 | |
| . | 291835 | |
| 0 | 291835 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 583670 | |
| Other Punctuation | 291835 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 291835 | |
| 0 | 291835 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 291835 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 875505 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 291835 | |
| . | 291835 | |
| 0 | 291835 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 875505 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 291835 | |
| . | 291835 | |
| 0 | 291835 |
DATUM_BESTAND
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
| 2022-04-12 |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 2918350 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-04-12 |
|---|---|
| 2nd row | 2022-04-12 |
| 3rd row | 2022-04-12 |
| 4th row | 2022-04-12 |
| 5th row | 2022-04-12 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2022-04-12 | 291835 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1167340 | |
| 0 | 583670 | |
| - | 583670 | |
| 4 | 291835 | 10.0% |
| 1 | 291835 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2334680 | |
| Dash Punctuation | 583670 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1167340 | |
| 0 | 583670 | |
| 4 | 291835 | 12.5% |
| 1 | 291835 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 583670 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2918350 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1167340 | |
| 0 | 583670 | |
| - | 583670 | |
| 4 | 291835 | 10.0% |
| 1 | 291835 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2918350 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1167340 | |
| 0 | 583670 | |
| - | 583670 | |
| 4 | 291835 | 10.0% |
| 1 | 291835 | 10.0% |
PEILDATUM
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
| 2022-04-01 |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 2918350 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-04-01 |
|---|---|
| 2nd row | 2022-04-01 |
| 3rd row | 2022-04-01 |
| 4th row | 2022-04-01 |
| 5th row | 2022-04-01 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2022-04-01 | 291835 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 875505 | |
| 0 | 875505 | |
| - | 583670 | |
| 4 | 291835 | 10.0% |
| 1 | 291835 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2334680 | |
| Dash Punctuation | 583670 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 875505 | |
| 0 | 875505 | |
| 4 | 291835 | 12.5% |
| 1 | 291835 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 583670 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2918350 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 875505 | |
| 0 | 875505 | |
| - | 583670 | |
| 4 | 291835 | 10.0% |
| 1 | 291835 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2918350 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 875505 | |
| 0 | 875505 | |
| - | 583670 | |
| 4 | 291835 | 10.0% |
| 1 | 291835 | 10.0% |
JAAR
Date
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2022-01-01 00:00:00 |
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 423.5461991 |
| Minimum | 301 |
|---|---|
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 928.9834954 |
|---|---|
| Coefficient of variation (CV) | 2.193346316 |
| Kurtosis | 69.92118548 |
| Mean | 423.5461991 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 8.47403304 |
| Sum | 123605605 |
| Variance | 863010.3347 |
| Monotonicity | Not monotonic |
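Several of the figures above are derived from one another. The sketch below recomputes range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = 301, 8418
q1, q3 = 305, 322
mean = 423.5461991
std = 928.9834954

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values
cv = std / mean                  # standard deviation relative to the mean
variance = std ** 2              # squared standard deviation

print(value_range, iqr, round(cv, 4), round(variance, 1))
```

The gap between the IQR (17) and the range (8117) comes from a few values, such as 1900 and 8418, that lie far from the bulk of values between 301 and 390.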
| Value | Count | Frequency (%) |
|---|---|---|
| 305 | 41476 | 14.2% |
| 313 | 37733 | 12.9% |
| 303 | 33646 | 11.5% |
| 330 | 23226 | 8.0% |
| 316 | 19881 | 6.8% |
| 308 | 15571 | 5.3% |
| 306 | 12204 | 4.2% |
| 324 | 12090 | 4.1% |
| 301 | 11721 | 4.0% |
| 304 | 9477 | 3.2% |
| Other values (17) | 74810 | 25.6% |
| Value | Count | Frequency (%) |
| 301 | 11721 | 4.0% |
| 302 | 6371 | 2.2% |
| 303 | 33646 | |
| 304 | 9477 | 3.2% |
| 305 | 41476 | |
| 306 | 12204 | 4.2% |
| 307 | 5053 | 1.7% |
| 308 | 15571 | 5.3% |
| 310 | 3256 | 1.1% |
| 313 | 37733 |
| Value | Count | Frequency (%) |
| 8418 | 3880 | 1.3% |
| 1900 | 190 | 0.1% |
| 390 | 765 | 0.3% |
| 389 | 3118 | 1.1% |
| 362 | 4140 | 1.4% |
| 361 | 2084 | 0.7% |
| 335 | 2961 | 1.0% |
| 330 | 23226 | |
| 329 | 759 | 0.3% |
| 328 | 6354 | 2.2% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct | 1770 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
| 101 | 1236 |
|---|---|
| 402 | 1193 |
| 403 | 1164 |
| 301 | 1162 |
| 203 | 1100 |
| Other values (1765) |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.351479432 |
| Min length | 2 |
Characters and Unicode
| Total characters | 978079 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 0215 |
|---|---|
| 2nd row | 0114 |
| 3rd row | 0316 |
| 4th row | 0218 |
| 5th row | 0611 |
Common Values
| Value | Count | Frequency (%) |
| 101 | 1236 | 0.4% |
| 402 | 1193 | 0.4% |
| 403 | 1164 | 0.4% |
| 301 | 1162 | 0.4% |
| 203 | 1100 | 0.4% |
| 201 | 1092 | 0.4% |
| 401 | 973 | 0.3% |
| 404 | 967 | 0.3% |
| 802 | 953 | 0.3% |
| 409 | 939 | 0.3% |
| Other values (1760) | 281056 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 187421 | |
| 0 | 178766 | |
| 2 | 129734 | |
| 3 | 106029 | |
| 5 | 75348 | |
| 9 | 70560 | 7.2% |
| 4 | 69530 | 7.1% |
| 7 | 57616 | 5.9% |
| 6 | 51166 | 5.2% |
| 8 | 42043 | 4.3% |
| Other values (15) | 9866 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 968213 | |
| Uppercase Letter | 9866 | 1.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 1838 | |
| M | 1650 | |
| B | 1184 | |
| E | 839 | |
| Z | 807 | |
| D | 670 | 6.8% |
| A | 640 | 6.5% |
| F | 620 | 6.3% |
| C | 331 | 3.4% |
| K | 318 | 3.2% |
| Other values (5) | 969 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 187421 | |
| 0 | 178766 | |
| 2 | 129734 | |
| 3 | 106029 | |
| 5 | 75348 | |
| 9 | 70560 | 7.3% |
| 4 | 69530 | 7.2% |
| 7 | 57616 | 6.0% |
| 6 | 51166 | 5.3% |
| 8 | 42043 | 4.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 968213 | |
| Latin | 9866 | 1.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 1838 | |
| M | 1650 | |
| B | 1184 | |
| E | 839 | |
| Z | 807 | |
| D | 670 | 6.8% |
| A | 640 | 6.5% |
| F | 620 | 6.3% |
| C | 331 | 3.4% |
| K | 318 | 3.2% |
| Other values (5) | 969 |
Common
| Value | Count | Frequency (%) |
| 1 | 187421 | |
| 0 | 178766 | |
| 2 | 129734 | |
| 3 | 106029 | |
| 5 | 75348 | |
| 9 | 70560 | 7.3% |
| 4 | 69530 | 7.2% |
| 7 | 57616 | 6.0% |
| 6 | 51166 | 5.3% |
| 8 | 42043 | 4.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 978079 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 187421 | |
| 0 | 178766 | |
| 2 | 129734 | |
| 3 | 106029 | |
| 5 | 75348 | |
| 9 | 70560 | 7.2% |
| 4 | 69530 | 7.1% |
| 7 | 57616 | 5.9% |
| 6 | 51166 | 5.2% |
| 8 | 42043 | 4.3% |
| Other values (15) | 9866 | 1.0% |
ZORGPRODUCT_CD
Real number (ℝ≥0)
| Distinct | 5965 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 442329517.3 |
| Minimum | 10501002 |
|---|---|
| Maximum | 998418081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 10501002 |
|---|---|
| 5-th percentile | 28999038 |
| Q1 | 99899003 |
| median | 149899003 |
| Q3 | 990004004 |
| 95-th percentile | 990516027 |
| Maximum | 998418081 |
| Range | 987917079 |
| Interquartile range (IQR) | 890105001 |
Descriptive statistics
| Standard deviation | 429418787.8 |
|---|---|
| Coefficient of variation (CV) | 0.9708119647 |
| Kurtosis | -1.744660847 |
| Mean | 442329517.3 |
| Median Absolute Deviation (MAD) | 119999998 |
| Skewness | 0.460012133 |
| Sum | 1.290872347 × 10¹⁴ |
| Variance | 1.844004953 × 10¹⁷ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 990004009 | 2114 | 0.7% |
| 990004007 | 2083 | 0.7% |
| 990003004 | 2065 | 0.7% |
| 990004006 | 1698 | 0.6% |
| 990356076 | 1522 | 0.5% |
| 990356073 | 1409 | 0.5% |
| 990003007 | 1340 | 0.5% |
| 131999228 | 1274 | 0.4% |
| 131999164 | 1261 | 0.4% |
| 199299013 | 1209 | 0.4% |
| Other values (5955) | 275860 |
| Value | Count | Frequency (%) |
| 10501002 | 8 | |
| 10501003 | 10 | |
| 10501004 | 10 | |
| 10501005 | 11 | |
| 10501007 | 3 | < 0.1% |
| 10501008 | 11 | |
| 10501010 | 10 | |
| 10501011 | 3 | < 0.1% |
| 11101002 | 9 | |
| 11101003 | 10 |
| Value | Count | Frequency (%) |
| 998418081 | 144 | |
| 998418080 | 128 | |
| 998418079 | 35 | < 0.1% |
| 998418077 | 7 | < 0.1% |
| 998418076 | 7 | < 0.1% |
| 998418075 | 6 | < 0.1% |
| 998418074 | 188 | |
| 998418073 | 188 | |
| 998418072 | 7 | < 0.1% |
| 998418071 | 7 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 9705 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 522.6910789 |
| Minimum | 1 |
|---|---|
| Maximum | 164654 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 106 |
| 95-th percentile | 1783.3 |
| Maximum | 164654 |
| Range | 164653 |
| Interquartile range (IQR) | 103 |
Descriptive statistics
| Standard deviation | 3201.824143 |
|---|---|
| Coefficient of variation (CV) | 6.125652938 |
| Kurtosis | 388.5361534 |
| Mean | 522.6910789 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 16.3777953 |
| Sum | 152539551 |
| Variance | 10251677.84 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 47806 | 16.4% |
| 2 | 23445 | 8.0% |
| 3 | 15283 | 5.2% |
| 4 | 11275 | 3.9% |
| 5 | 8854 | 3.0% |
| 6 | 7387 | 2.5% |
| 7 | 6238 | 2.1% |
| 8 | 5255 | 1.8% |
| 9 | 4818 | 1.7% |
| 10 | 4241 | 1.5% |
| Other values (9695) | 157233 |
| Value | Count | Frequency (%) |
| 1 | 47806 | |
| 2 | 23445 | |
| 3 | 15283 | 5.2% |
| 4 | 11275 | 3.9% |
| 5 | 8854 | 3.0% |
| 6 | 7387 | 2.5% |
| 7 | 6238 | 2.1% |
| 8 | 5255 | 1.8% |
| 9 | 4818 | 1.7% |
| 10 | 4241 | 1.5% |
| Value | Count | Frequency (%) |
| 164654 | 1 | |
| 155884 | 1 | |
| 154270 | 1 | |
| 151286 | 1 | |
| 144725 | 1 | |
| 119281 | 1 | |
| 118040 | 1 | |
| 115941 | 1 | |
| 110520 | 1 | |
| 109675 | 1 |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 10368 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 615.7620882 |
| Minimum | 1 |
|---|---|
| Maximum | 239919 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 15 |
| Q3 | 116 |
| 95-th percentile | 2026 |
| Maximum | 239919 |
| Range | 239918 |
| Interquartile range (IQR) | 113 |
Descriptive statistics
| Standard deviation | 4101.048618 |
|---|---|
| Coefficient of variation (CV) | 6.660118732 |
| Kurtosis | 704.0325055 |
| Mean | 615.7620882 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 21.01716558 |
| Sum | 179700929 |
| Variance | 16818599.77 |
| Monotonicity | Not monotonic |
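Skewness and kurtosis summarise the shape of the distribution; the large positive skewness here reflects a long right tail (median 15, maximum 239919). The sketch below computes the plain moment-based (population) versions on small made-up samples. It is illustrative only: the report's figures may come from bias-corrected estimators, so this is not necessarily how the values above were produced.

```python
def moments_shape(values):
    """Return (skewness, excess kurtosis) based on population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


# Made-up samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 3, 15, 120, 2000]

print(moments_shape(symmetric))
print(moments_shape(right_tailed))
```

A symmetric sample gives a skewness of zero, while a sample dominated by a few very large values gives a clearly positive skewness, which is the pattern the SKEWED alert points to.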
| Value | Count | Frequency (%) |
| 1 | 46020 | 15.8% |
| 2 | 23037 | 7.9% |
| 3 | 15141 | 5.2% |
| 4 | 11042 | 3.8% |
| 5 | 8792 | 3.0% |
| 6 | 7381 | 2.5% |
| 7 | 6207 | 2.1% |
| 8 | 5188 | 1.8% |
| 9 | 4751 | 1.6% |
| 10 | 4240 | 1.5% |
| Other values (10358) | 160036 |
| Value | Count | Frequency (%) |
| 1 | 46020 | |
| 2 | 23037 | |
| 3 | 15141 | 5.2% |
| 4 | 11042 | 3.8% |
| 5 | 8792 | 3.0% |
| 6 | 7381 | 2.5% |
| 7 | 6207 | 2.1% |
| 8 | 5188 | 1.8% |
| 9 | 4751 | 1.6% |
| 10 | 4240 | 1.5% |
| Value | Count | Frequency (%) |
| 239919 | 1 | |
| 232431 | 1 | |
| 232118 | 1 | |
| 228047 | 1 | |
| 227606 | 1 | |
| 226776 | 1 | |
| 224099 | 1 | |
| 218623 | 1 | |
| 214231 | 1 | |
| 204769 | 1 |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 8615 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7853.93679 |
| Minimum | 1 |
|---|---|
| Maximum | 227540 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 44 |
| Q1 | 425 |
| median | 1785 |
| Q3 | 6606 |
| 95-th percentile | 37371 |
| Maximum | 227540 |
| Range | 227539 |
| Interquartile range (IQR) | 6181 |
Descriptive statistics
| Standard deviation | 18022.98007 |
|---|---|
| Coefficient of variation (CV) | 2.294770197 |
| Kurtosis | 32.85159273 |
| Mean | 7853.93679 |
| Median Absolute Deviation (MAD) | 1618 |
| Skewness | 4.979202913 |
| Sum | 2292053643 |
| Variance | 324827810.6 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21 | 455 | 0.2% |
| 25 | 437 | 0.1% |
| 9 | 430 | 0.1% |
| 8 | 417 | 0.1% |
| 14 | 408 | 0.1% |
| 1 | 395 | 0.1% |
| 26 | 392 | 0.1% |
| 37 | 388 | 0.1% |
| 6 | 385 | 0.1% |
| 19 | 379 | 0.1% |
| Other values (8605) | 287749 |
| Value | Count | Frequency (%) |
| 1 | 395 | |
| 2 | 359 | |
| 3 | 313 | |
| 4 | 367 | |
| 5 | 311 | |
| 6 | 385 | |
| 7 | 340 | |
| 8 | 417 | |
| 9 | 430 | |
| 10 | 273 |
| Value | Count | Frequency (%) |
| 227540 | 23 | |
| 213802 | 24 | |
| 213754 | 17 | |
| 213538 | 25 | |
| 211599 | 17 | |
| 210437 | 19 | |
| 205351 | 17 | |
| 200605 | 16 | |
| 198530 | 20 | |
| 189109 | 19 |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 9499 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11250.18617 |
| Minimum | 1 |
|---|---|
| Maximum | 368507 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 56 |
| Q1 | 562 |
| median | 2458 |
| Q3 | 9299 |
| 95-th percentile | 53035 |
| Maximum | 368507 |
| Range | 368506 |
| Interquartile range (IQR) | 8737 |
Descriptive statistics
| Standard deviation | 26658.00318 |
|---|---|
| Coefficient of variation (CV) | 2.369561069 |
| Kurtosis | 36.47929954 |
| Mean | 11250.18617 |
| Median Absolute Deviation (MAD) | 2248 |
| Skewness | 5.227601553 |
| Sum | 3283198082 |
| Variance | 710649133.5 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 33 | 339 | 0.1% |
| 1 | 320 | 0.1% |
| 10 | 319 | 0.1% |
| 77 | 317 | 0.1% |
| 17 | 316 | 0.1% |
| 25 | 311 | 0.1% |
| 57 | 307 | 0.1% |
| 6 | 306 | 0.1% |
| 46 | 303 | 0.1% |
| 38 | 303 | 0.1% |
| Other values (9489) | 288694 |
| Value | Count | Frequency (%) |
| 1 | 320 | |
| 2 | 296 | |
| 3 | 264 | |
| 4 | 273 | |
| 5 | 266 | |
| 6 | 306 | |
| 7 | 272 | |
| 8 | 272 | |
| 9 | 246 | |
| 10 | 319 |
| Value | Count | Frequency (%) |
| 368507 | 23 | |
| 348526 | 25 | |
| 341695 | 19 | |
| 335999 | 24 | |
| 323792 | 20 | |
| 314674 | 17 | |
| 310780 | 17 | |
| 298653 | 17 | |
| 289047 | 16 | |
| 274559 | 17 |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 287 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 685775.8265 |
| Minimum | 2 |
|---|---|
| Maximum | 1489487 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 43789 |
| Q1 | 350206 |
| median | 753965 |
| Q3 | 1017453 |
| 95-th percentile | 1336144 |
| Maximum | 1489487 |
| Range | 1489485 |
| Interquartile range (IQR) | 667247 |
Descriptive statistics
| Standard deviation | 410261.8175 |
|---|---|
| Coefficient of variation (CV) | 0.5982447932 |
| Kurtosis | -1.064700592 |
| Mean | 685775.8265 |
| Median Absolute Deviation (MAD) | 309630 |
| Skewness | -0.04365978264 |
| Sum | 2.001333883 × 10¹¹ |
| Variance | 1.683147589 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 880960 | 5102 | 1.7% |
| 874266 | 4354 | 1.5% |
| 843991 | 4348 | 1.5% |
| 894394 | 4333 | 1.5% |
| 880555 | 4273 | 1.5% |
| 894933 | 4212 | 1.4% |
| 753965 | 4083 | 1.4% |
| 1084063 | 3890 | 1.3% |
| 1101059 | 3864 | 1.3% |
| 1063595 | 3851 | 1.3% |
| Other values (277) | 249525 |
| Value | Count | Frequency (%) |
| 2 | 1 | < 0.1% |
| 4 | 7 | < 0.1% |
| 6 | 4 | < 0.1% |
| 7 | 5 | < 0.1% |
| 10 | 6 | < 0.1% |
| 12 | 14 | |
| 17 | 4 | < 0.1% |
| 21 | 15 | |
| 22 | 6 | < 0.1% |
| 24 | 19 |
| Value | Count | Frequency (%) |
| 1489487 | 2976 | |
| 1450610 | 3054 | |
| 1421820 | 3564 | |
| 1345187 | 3543 | |
| 1336144 | 3439 | |
| 1332858 | 3546 | |
| 1317333 | 3463 | |
| 1296714 | 1181 | 0.4% |
| 1283065 | 3577 | |
| 1262581 | 1201 | 0.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 288 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1101307.53 |
| Minimum | 2 |
|---|---|
| Maximum | 2666725 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 49801 |
| Q1 | 520135 |
| median | 1091367 |
| Q3 | 1813055 |
| 95-th percentile | 2557581 |
| Maximum | 2666725 |
| Range | 2666723 |
| Interquartile range (IQR) | 1292920 |
Descriptive statistics
| Standard deviation | 727351.4007 |
|---|---|
| Coefficient of variation (CV) | 0.6604435009 |
| Kurtosis | -0.8156803321 |
| Mean | 1101307.53 |
| Median Absolute Deviation (MAD) | 627405 |
| Skewness | 0.3061909623 |
| Sum | 3.21400083 × 10¹¹ |
| Variance | 5.290400601 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1211813 | 5102 | 1.7% |
| 1281742 | 4354 | 1.5% |
| 1216294 | 4348 | 1.5% |
| 1315716 | 4333 | 1.5% |
| 1300618 | 4273 | 1.5% |
| 1336472 | 4212 | 1.4% |
| 1135254 | 4083 | 1.4% |
| 2557581 | 3890 | 1.3% |
| 2666725 | 3864 | 1.3% |
| 2488283 | 3851 | 1.3% |
| Other values (278) | 249525 |
| Value | Count | Frequency (%) |
| 2 | 1 | < 0.1% |
| 4 | 7 | |
| 6 | 4 | < 0.1% |
| 8 | 5 | < 0.1% |
| 10 | 6 | < 0.1% |
| 13 | 14 | |
| 17 | 4 | < 0.1% |
| 21 | 15 | |
| 22 | 6 | < 0.1% |
| 25 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 2666725 | 3864 | |
| 2603360 | 3845 | |
| 2574080 | 3769 | |
| 2557581 | 3890 | |
| 2488283 | 3851 | |
| 2184158 | 3757 | |
| 2066229 | 3810 | |
| 2045000 | 1169 | 0.4% |
| 1990305 | 1167 | 0.4% |
| 1978427 | 3691 |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
| Distinct | 3382 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 44396 |
| Missing (%) | 15.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3560.434531 |
| Minimum | 70 |
|---|---|
| Maximum | 287220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.2 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 475 |
| median | 1255 |
| Q3 | 4145 |
| 95-th percentile | 13455 |
| Maximum | 287220 |
| Range | 287150 |
| Interquartile range (IQR) | 3670 |
Descriptive statistics
| Standard deviation | 6554.232076 |
|---|---|
| Coefficient of variation (CV) | 1.840851733 |
| Kurtosis | 155.9786254 |
| Mean | 3560.434531 |
| Median Absolute Deviation (MAD) | 1020 |
| Skewness | 7.460251504 |
| Sum | 880990360 |
| Variance | 42957958.1 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 160 | 1842 | 0.6% |
| 105 | 1827 | 0.6% |
| 110 | 1771 | 0.6% |
| 145 | 1375 | 0.5% |
| 180 | 1373 | 0.5% |
| 300 | 1289 | 0.4% |
| 165 | 1258 | 0.4% |
| 125 | 1251 | 0.4% |
| 185 | 1235 | 0.4% |
| 140 | 1222 | 0.4% |
| Other values (3372) | 232996 | |
| (Missing) | 44396 | 15.2% |
| Value | Count | Frequency (%) |
| 70 | 226 | 0.1% |
| 75 | 75 | < 0.1% |
| 80 | 362 | 0.1% |
| 85 | 917 | |
| 90 | 609 | 0.2% |
| 95 | 673 | 0.2% |
| 100 | 887 | |
| 105 | 1827 | |
| 110 | 1771 | |
| 115 | 872 |
| Value | Count | Frequency (%) |
| 287220 | 8 | |
| 148910 | 3 | < 0.1% |
| 142835 | 4 | |
| 122155 | 4 | |
| 116765 | 3 | < 0.1% |
| 109725 | 7 | |
| 108570 | 7 | |
| 107655 | 4 | |
| 101270 | 8 | |
| 95465 | 7 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
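The rank-based calculation described above can be sketched in plain Python. This is a minimal illustration that assumes tied values receive the mean of their ranks; the function names and the sample data are made up and are not the report's implementation.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y), pearson(x, y))
```

For this monotonic but nonlinear relationship, ρ is 1 up to floating-point rounding, while r computed on the raw values stays below 1.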
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
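The invariance under changes of location and scale can be checked numerically. A minimal sketch on made-up data, assuming positive scale factors; the function name is illustrative:

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Made-up, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_transformed)
```

Both calls return the same value up to rounding, because the shifts cancel in the deviations from the mean and the scale factors cancel between numerator and denominator.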
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
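The pair-counting definition above translates directly into code. The sketch below implements that plain form (often called τ-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently use a tie-adjusted variant, so results can differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Made-up data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
print(kendall_tau(x, y))
```

The quadratic loop over all pairs is fine for a short illustration but would be slow on a column with hundreds of thousands of rows.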
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0215 | 990027054 | 1 | 1 | 1152 | 1536 | 182209 | 240074 | 58975.0 |
| 1 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0114 | 990027044 | 1 | 1 | 15984 | 18633 | 182209 | 240074 | NaN |
| 2 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0316 | 990027021 | 4 | 4 | 1284 | 1734 | 182209 | 240074 | 6840.0 |
| 3 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0218 | 990027019 | 18 | 21 | 251 | 303 | 182209 | 240074 | 3575.0 |
| 4 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0611 | 990027065 | 1 | 1 | 1617 | 1725 | 182209 | 240074 | NaN |
| 5 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0613 | 990027027 | 396 | 399 | 3316 | 3783 | 182209 | 240074 | 8615.0 |
| 6 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0616 | 990027041 | 3 | 3 | 3184 | 4059 | 182209 | 240074 | 20210.0 |
| 7 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0112 | 990027013 | 145 | 146 | 965 | 1137 | 182209 | 240074 | 465.0 |
| 8 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0211 | 990027013 | 2 | 2 | 78 | 118 | 182209 | 240074 | 465.0 |
| 9 | 1.0 | 2022-04-12 | 2022-04-01 | 2012-01-01 | 327 | 0615 | 990027014 | 138 | 149 | 859 | 978 | 182209 | 240074 | 1385.0 |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 291825 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0317 | 990027199 | 1 | 1 | 3 | 3 | 93 | 93 | NaN |
| 291826 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0615 | 990027199 | 1 | 1 | 1 | 1 | 93 | 93 | NaN |
| 291827 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0415 | 990027198 | 7 | 7 | 9 | 9 | 93 | 93 | NaN |
| 291828 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0115 | 990027199 | 3 | 3 | 4 | 4 | 93 | 93 | NaN |
| 291829 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0616 | 990027199 | 5 | 5 | 5 | 5 | 93 | 93 | NaN |
| 291830 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0713 | 990027198 | 4 | 4 | 27 | 27 | 93 | 93 | NaN |
| 291831 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0115 | 990027198 | 1 | 1 | 4 | 4 | 93 | 93 | NaN |
| 291832 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0116 | 990027199 | 13 | 13 | 13 | 13 | 93 | 93 | NaN |
| 291833 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0415 | 990027199 | 2 | 2 | 9 | 9 | 93 | 93 | NaN |
| 291834 | 1.0 | 2022-04-12 | 2022-04-01 | 2022-01-01 | 327 | 0117 | 990027199 | 2 | 2 | 2 | 2 | 93 | 93 | NaN |